## **** __Utilized Cores__ **** = 2
## $subsetGenes
## [1] "protein_coding"
## 
## $subsetCells
## [1] 500
## 
## $resolution
## [1] 0.6
## 
## $resultsPath
## [1] "./Results"
## 
## $nCores
## [1] 2
## 
## $perplexity
## [1] 30

Louvain Modules

  • Identify co-expression modules.
    find_gene_modules essentially runs UMAP on the genes (as opposed to the cells) and then groups them into modules using Louvain community analysis.
  • Advantages: Much faster than other algorithms like WGCNA.
  • Disadvantages: Seems to be limited to ~1000 genes.

3D Scatter

Bulk RNAseq Modules vs. scRNAseq Modules

  • Co-expression modules were identified through the WGCNA analysis of bulk monocyte RNA-seq data, conducted by Katia Lopes.
  • Here we compare the modules identified in the scRNA-seq data (sc.modules) to the modules identified in the bulk monocyte data (bulk.modules).
## Warning in instance$preRenderHook(instance): It seems your data is too
## big for client-side DataTables. You may consider server-side processing:
## https://rstudio.github.io/DT/server.html

gprofiler Enrichment: sc.modules

## [1] "gprofiler2:: Running enrichment on module: 4"
## [1] "gprofiler2:: Running enrichment on module: 15"
## [1] "gprofiler2:: Running enrichment on module: 7"
## [1] "gprofiler2:: Running enrichment on module: 3"
## [1] "gprofiler2:: Running enrichment on module: 13"
## [1] "gprofiler2:: Running enrichment on module: 5"
## [1] "gprofiler2:: Running enrichment on module: 8"
## [1] "gprofiler2:: Running enrichment on module: 2"
## [1] "gprofiler2:: Running enrichment on module: 14"
## [1] "gprofiler2:: Running enrichment on module: 9"
## [1] "gprofiler2:: Running enrichment on module: 11"
## [1] "gprofiler2:: Running enrichment on module: 12"
## [1] "gprofiler2:: Running enrichment on module: 16"
## [1] "gprofiler2:: Running enrichment on module: 17"
## [1] "gprofiler2:: Running enrichment on module: 1"
## [1] "gprofiler2:: Running enrichment on module: 6"
## [1] "gprofiler2:: Running enrichment on module: 10"
## [1] "gprofiler2:: 17 tested for ontological enrichment in 49.63 seconds"

Project Bulk Modules to Single-cell Data

  • From here on, replace the discrete gene module assignments from sc.modules (UMAP+Louvain) with those from bulk.modules (WGCNA).
  • The “NA” module simply represents a collection of genes that were present in the sc dataset but not in the bulk dataset.
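The projection step above can be sketched as a simple lookup. This is only an illustration in Python, not the code used in this report: the function name `project_modules`, the gene names, and the module assignments in the toy data are all hypothetical.

```python
def project_modules(sc_genes, bulk_assignments, missing_label="NA"):
    """Assign each single-cell gene its bulk module, or "NA" if absent from bulk."""
    return {gene: bulk_assignments.get(gene, missing_label) for gene in sc_genes}


# Toy data (hypothetical gene names and assignments, for illustration only).
sc_genes = ["GENE_A", "GENE_B", "GENE_C"]
bulk_assignments = {"GENE_A": "darkmagenta", "GENE_B": "pink", "GENE_Z": "purple"}

projected = project_modules(sc_genes, bulk_assignments)
print(projected)
```

GENE_C is present in the single-cell data but not in the bulk data, so it lands in the "NA" module; GENE_Z exists only in the bulk data and is dropped.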

PD vs. Control DGE Modules

  • Plot only the modules that were differentially expressed (Wilcoxon nominal P-value < 0.05) between PD and controls in the bulk data.
  • Also plot darkmagenta, which is of interest due to its enrichment in mitochondrial GO terms.
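As a rough sketch of this selection rule, the Python below computes a two-sided Wilcoxon rank-sum p-value and keeps modules below the nominal threshold, plus darkmagenta. Several things here are assumptions: the test uses a normal approximation without continuity correction (so p-values will differ slightly from R's `wilcox.test`), the per-sample module scores are made-up numbers, and `mod_a`/`mod_b` are placeholder module names.

```python
import math


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        return 1.0
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    n = n1 + n2
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


def modules_to_plot(scores, alpha=0.05, always_include=("darkmagenta",)):
    """Keep modules with nominal p < alpha, then append any always-plotted modules."""
    selected = [m for m, (pd_vals, ctrl_vals) in scores.items()
                if rank_sum_pvalue(pd_vals, ctrl_vals) < alpha]
    selected += [m for m in always_include if m not in selected]
    return selected


# Made-up per-sample module scores: (PD samples, control samples).
toy_scores = {
    "mod_a": ([6, 7, 8, 9, 10], [1, 2, 3, 4, 5]),  # clearly shifted
    "mod_b": ([1, 3, 5, 7], [2, 4, 6, 8]),         # interleaved
}
selected = modules_to_plot(toy_scores)
print(selected)
```

Note that the threshold is nominal, i.e. not corrected for testing many modules, which matches the wording of the bullet above.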

Results Summary

  • Bulk.modules differentially expressed between PD and controls
    • Preferentially expressed in canonical monocytes
      • darkmagenta (particularly the cells furthest from intermediate monocytes, suggesting these cells are the least activated)
      • pink
      • honeydew1
    • Preferentially expressed in intermediate monocytes
      • purple (not clear separation but trending)
All Modules

I also plotted all 60+ of the bulk.modules (ordered alphabetically) to see if any of them were preferentially expressed in a given monocyte subtype. Darkmagenta remains one of the most striking examples, but there are others as well.

  • Preferentially expressed in canonical monocytes
    • grey80
    • ivory
  • Preferentially expressed in intermediate monocytes
    • firebrick4
    • orangered4
    • navajowhite2
    • red (trending)

Module-Module Overlap

  • Calculate enrichment scores for all combinations of bulk RNAseq and scRNAseq gene modules.
  • This tells you whether some modules are detectable in both datasets.
## [1] "mod.df1 contains 66 unique modules."
## [1] "mod.df2 contains 17 unique modules."
## [1] "Conducting enrichment tests on 1122 module-module combinations..."
## [1] "1122 enrichment tests conducted in 1.52 seconds."
## [1] "24 enrichment tests were significant (at FDR ≤ 0.05)."
## [1] "22.73% of mod.df1 modules showed enrichment for a mod.df2 module."
## [1] "82.35% of mod.df2 modules showed enrichment for a mod.df1 module."

Plot Module-Module Overlap

## Registered S3 method overwritten by 'seriation':
##   method         from 
##   reorder.hclust gclus

Module GO Enrichment

  • Iteratively test for enrichment of ontological terms in all modules (both bulk and single-cell).
  • Takes a while for long lists of modules (e.g. the 60+ bulk RNA-seq modules).
## [1] "Module GO Enrichment file detected. Importing... ./Results/gprofiler2.module.enrichment.txt"

Module-Module Similarity: Gene-level vs. Term-level

  • Compute how similar each of the top conserved module-module pairs is in terms of GO enrichment.
## [1] "Comparing overlap of enrichment terms for all module-module combinations."

Scatterplot

  • See how well module-module similarity scores (Jaccard index) correlate between the gene-level comparisons vs. the term-level comparisons.
  • Varying how many of the top enrichment terms were used to calculate term-level module-module similarity (e.g. 10, 50, 100, all) did not seem to significantly impact the correlation between methods (it generally hovered around Pearson’s r = 0.4-0.5).
  • Label each pair of modules in the scatterplot using the top ontological enrichment terms for each module respectively.
  • Jaccard.sum is the sum of both the gene-level and term-level Jaccard similarity scores.
    • This can be used as a way to identify module-module pairs that were similar at multiple levels (and thus have strong evidence of conservation).
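The quantities described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the report's code: the helper names, the `top_n` parameter, and the toy genes and terms are assumptions. For two sets A and B, the Jaccard index is |A ∩ B| / |A ∪ B|.

```python
import math


def jaccard(a, b):
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B| (0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def pair_similarity(genes1, genes2, terms1, terms2, top_n=None):
    """Gene-level and term-level Jaccard scores plus their sum (Jaccard.sum).

    terms1/terms2 are assumed to be ranked by enrichment significance;
    top_n=None uses all terms.
    """
    gene_j = jaccard(genes1, genes2)
    term_j = jaccard(terms1[:top_n], terms2[:top_n])
    return {"gene": gene_j, "term": term_j, "sum": gene_j + term_j}


# Toy module pair (hypothetical genes and terms).
scores = pair_similarity(
    genes1={"g1", "g2", "g3", "g4"}, genes2={"g1", "g2", "g3", "g5"},
    terms1=["t1", "t2", "t3", "t9"], terms2=["t1", "t2", "t4", "t8"],
    top_n=3,
)
print(scores)  # gene-level 3/5, term-level 2/4

# Correlation between gene-level and term-level scores across (made-up) pairs.
r = pearson_r([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])
```

Since each Jaccard score lies between 0 and 1, Jaccard.sum ranges from 0 to 2, with high values indicating pairs that agree at both the gene and the term level.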